44 research outputs found

    Off-Policy Actor-Critic with Emphatic Weightings

    A variety of theoretically sound policy gradient algorithms exist for the on-policy setting due to the policy gradient theorem, which provides a simplified form for the gradient. The off-policy setting, however, has been less clear due to the existence of multiple objectives and the lack of an explicit off-policy policy gradient theorem. In this work, we unify these objectives into one off-policy objective, and provide a policy gradient theorem for this unified objective. The derivation involves emphatic weightings and interest functions. We show multiple strategies to approximate the gradients, in an algorithm called Actor Critic with Emphatic weightings (ACE). We prove in a counterexample that previous (semi-gradient) off-policy actor-critic methods, particularly Off-Policy Actor-Critic (OffPAC) and Deterministic Policy Gradient (DPG), converge to the wrong solution whereas ACE finds the optimal solution. We also highlight why these semi-gradient approaches can still perform well in practice, suggesting strategies for variance reduction in ACE. We empirically study several variants of ACE on two classic control environments and an image-based environment designed to illustrate the tradeoffs made by each gradient approximation. We find that by approximating the emphatic weightings directly, ACE performs as well as or better than OffPAC in all settings tested. Comment: 63 pages

    Label Alignment Regularization for Distribution Shift

    Recent work reported the label alignment property in a supervised learning setting: the vector of all labels in the dataset is mostly in the span of the top few singular vectors of the data matrix. Inspired by this observation, we derive a regularization method for unsupervised domain adaptation. Instead of regularizing representation learning, as popular domain adaptation methods do, we regularize the classifier so that the target domain predictions can to some extent "align" with the top singular vectors of the unsupervised data matrix from the target domain. In a linear regression setting, we theoretically justify the label alignment property and characterize the optimality of the solution of our regularization by bounding its distance to the optimal solution. We conduct experiments to show that our method can work well on label shift problems, where classic domain adaptation methods are known to fail. We also report mild improvement over domain adaptation baselines on a set of commonly seen MNIST-USPS domain adaptation tasks and on cross-lingual sentiment analysis tasks.

    The Tunnel Effect: Building Data Representations in Deep Neural Networks

    Deep neural networks are widely known for their remarkable effectiveness across various tasks, with the consensus that deeper networks implicitly learn more complex data representations. This paper shows that sufficiently deep networks trained for supervised image classification split into two distinct parts that contribute to the resulting data representations differently. The initial layers create linearly-separable representations, while the subsequent layers, which we refer to as the tunnel, compress these representations and have a minimal impact on the overall performance. We explore the tunnel's behavior through comprehensive empirical studies, highlighting that it emerges early in the training process. Its depth depends on the relation between the network's capacity and task complexity. Furthermore, we show that the tunnel degrades out-of-distribution generalization and discuss its implications for continual learning. Comment: NeurIPS 202

    Sonographic and functional characteristics of thyroid nodules in a population of adult people in Isfahan

    Introduction: The aim of this study was to investigate the current sonographic characteristics of thyroid nodules in Isfahan, a previously iodine-deficient area in central Iran. Material and methods: In a cross-sectional study conducted in 2006, 2523 adults (age > 20 years) were selected by a multistage cluster sampling method. Of these, 263 volunteers underwent sonographic evaluation. Thyroid examination was done by two expert sonographers. Serum T3, T4, T3RU, TSH, TPO Ab and Tg Ab, and urinary iodine were measured. Results: Forty-six per cent of the 263 people were women. Their mean age was 35.5 years (range 20-64 years). Median urinary iodine was 19.4 μg/dL. The prevalence of thyroid nodules on sonography was 22.4% in the whole group; 30% in women and 16.3% in men (OR = 2.2, P = 0.01). The prevalence of thyroid nodules increased with age (P = 0.006). The prevalence of thyroid nodules was higher in hypothyroid people than in euthyroid people (35.1% v. 20.5%, OR = 2.1, P = 0.04). Neither urinary iodine nor autoantibody concentrations correlated with the prevalence of thyroid nodules on sonography. Conclusions: The prevalence of thyroid nodules on sonography is still high in this population despite relatively normal urinary iodine. (Pol J Endocrinol 2010; 61 (2): 188-191)
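The two odds ratios quoted in this abstract can be checked against the reported prevalences. The sketch below is a rough consistency check only: it assumes the ORs are unadjusted and recomputes them from the rounded percentages, which the abstract does not confirm. The function name `odds_ratio` is illustrative and not from the paper.

```python
def odds_ratio(p_group: float, p_reference: float) -> float:
    """Unadjusted odds ratio from two prevalences given as fractions in (0, 1)."""
    odds_group = p_group / (1.0 - p_group)
    odds_reference = p_reference / (1.0 - p_reference)
    return odds_group / odds_reference


# Prevalences as reported in the abstract (rounded percentages).
women_vs_men = odds_ratio(0.30, 0.163)          # nodules: women vs. men
hypo_vs_euthyroid = odds_ratio(0.351, 0.205)    # nodules: hypothyroid vs. euthyroid

print(round(women_vs_men, 1), round(hypo_vs_euthyroid, 1))
```

Under that assumption, both recomputed values round to the figures given in the abstract (2.2 and 2.1).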

    Does increased Nitric Oxide production and oxidative stress due to high fat diet affect cardiac function after myocardial infarction?

    Background & Objectives: A high fat (HF) diet, by affecting oxidative stress and nitric oxide (NO) production, may lead to different effects on heart function after myocardial infarction (MI). In the present study we aimed to address the hypothesis that high release of NO by activated macrophages affects LV function after MI. Methods: The animals were randomly divided into four groups of 10 rats each: 1) Sham; 2) MI; 3) Sham + HF diet; 4) MI + HF diet. Animals were fed the HF diet for 30 days before sham or MI surgery. MI was induced by permanent ligation of the left anterior descending coronary artery (LAD). NO production of peritoneal macrophages, the concentrations of MDA in the heart, and the infarct size were measured. Results: Our study indicated that the HF diet has adverse effects on the myocardium and may increase NO production as well as oxidative stress, resulting in augmentation of infarct size. Conclusion: Our results indicate that the HF diet was associated with overproduction of NO by peritoneal macrophages and of ROS, leading to increased infarct size and adverse remodeling.